Stochastic Propositionalization of Non-determinate Background Knowledge
Abstract
It is a well-known fact that propositional learning algorithms require "good" features to perform well in practice. So a major step in data engineering for inductive learning is the construction of good features by domain experts. These features often represent properties of structured objects, where a property typically is the occurrence of a certain substructure having certain properties. To partly automate the process of "feature engineering", we devised an algorithm that searches for features which are defined by such substructures. The algorithm stochastically conducts a top-down search for first-order clauses, where each clause represents a binary feature. It differs from existing algorithms in that its search is not class-blind, and that it is capable of considering clauses ("context") of almost arbitrary length (size). Preliminary experiments are favorable, and support the view that this approach is promising.
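The idea of a stochastic, class-aware, top-down search for clause-defined binary features can be illustrated with a minimal sketch. This is not the authors' algorithm: the object representation (a list of parts, each a dict of attributes), the scoring function (difference in coverage between positive and negative examples), the sampling scheme, and all names such as `covers`, `score`, and `stochastic_search` are assumptions made for illustration only.

```python
import random

# Assumed representation: an object is a list of parts; a part is a dict of attributes.
# A clause is a frozenset of (attribute, value) conditions. The binary feature it
# defines is true if at least one part of the object satisfies all conditions,
# i.e. the object contains a matching substructure.

def covers(clause, obj):
    return any(all(part.get(a) == v for a, v in clause) for part in obj)

def score(clause, examples):
    # Class-aware score (an assumption): absolute difference between the fraction
    # of positive and the fraction of negative examples covered by the clause.
    pos = [o for o, y in examples if y]
    neg = [o for o, y in examples if not y]
    p = sum(covers(clause, o) for o in pos) / len(pos)
    n = sum(covers(clause, o) for o in neg) / len(neg)
    return abs(p - n)

def refinements(clause, literals):
    # Top-down step: specialize the clause by adding one condition on an unused attribute.
    used = {a for a, _ in clause}
    return [clause | {lit} for lit in literals if lit[0] not in used]

def stochastic_search(examples, max_len=3, steps=50, seed=0):
    rng = random.Random(seed)
    literals = sorted({(a, v) for o, _ in examples for part in o for a, v in part.items()})
    best, best_score = frozenset(), 0.0
    for _ in range(steps):
        clause = frozenset()  # start from the most general clause
        while len(clause) < max_len:
            cands = refinements(clause, literals)
            if not cands:
                break
            # Sample a refinement with probability proportional to its score;
            # the small constant keeps zero-score refinements reachable.
            weights = [score(c, examples) + 1e-6 for c in cands]
            clause = rng.choices(cands, weights=weights, k=1)[0]
            s = score(clause, examples)
            if s > best_score:
                best, best_score = clause, s
    return best, best_score

# Hypothetical toy data: no single condition separates the classes, but a
# conjunction of two conditions on the same part does.
EXAMPLES = [
    ([{"element": "N", "charge": "neg"}, {"element": "C", "charge": "zero"}], True),
    ([{"element": "N", "charge": "neg"}, {"element": "O", "charge": "zero"}], True),
    ([{"element": "N", "charge": "zero"}, {"element": "C", "charge": "neg"}], False),
    ([{"element": "N", "charge": "zero"}, {"element": "O", "charge": "neg"}], False),
]

if __name__ == "__main__":
    print(stochastic_search(EXAMPLES, max_len=2))
```

Because zero-score refinements keep a small sampling weight, the search can pass through an uninformative condition on its way to a longer, discriminating clause, which a purely greedy search would not do. The sketch assumes at least one positive and one negative example.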
Similar resources
Using Taxonomic Background Knowledge in Propositionalization and Rule Learning
Knowledge representations using semantic web technologies often provide information which translates to explicit term and predicate taxonomies in relational learning. Here we show how to speed up the process of propositionalization of relational data by orders of magnitude, by exploiting such ontologies through a novel refinement operator used in the construction of conjunctive relational featu...
Representational/Efficiency Issues in Toxicological Knowledge Discovery
We tackle the problem of automatic detection of Structure-Activity relationships by means of Inductive Logic Programming (ILP). We describe an algorithm for constructing features from background knowledge (stochastic propositionalization, SP) and an abstract level representation for chemical compounds.
On propositionalization for knowledge discovery in relational databases
Propositionalization is a process that leads from relational data and background knowledge to a single-table representation thereof, which serves as the input to widespread systems for knowledge discovery in databases. Systems for propositionalization thus support the analyst during the usually costly phase of data preparation for data mining. Such systems have been applied for more than 15 yea...
Comparative Evaluation of Approaches to Propositionalization
Propositionalization has already been shown to be a promising approach for robustly and effectively handling relational data sets for knowledge discovery. In this paper, we compare up-to-date methods for propositionalization from two main groups: logic-oriented and database-oriented techniques. Experiments using several learning tasks, both ILP benchmarks and tasks from recent international dat...
Efficiency-conscious propositionalization for relational learning
Systems aiming at discovering interesting knowledge in data, now commonly called data mining systems, are typically employed in finding patterns in a single relational table. Most mainstream data mining tools are not applicable in the more challenging task of finding knowledge in structured data represented by a multi-relational database. Although a family of methods known as inductive logic pro...
Publication date: 1998